    Representations of specific acoustic patterns in the auditory cortex and hippocampus

    Previous behavioural studies have shown that repeated presentation of a randomly chosen acoustic pattern leads to the unsupervised learning of some of its specific acoustic features. The objective of our study was to determine the neural substrate for the representation of freshly learnt acoustic patterns. Subjects first performed a behavioural task that resulted in the incidental learning of three different noise-like acoustic patterns. During subsequent high-resolution functional magnetic resonance imaging (fMRI) scanning, subjects were exposed again to these three learnt patterns and to others that had not been learnt. Multi-voxel pattern analysis was used to test whether the learnt acoustic patterns could be 'decoded' from the patterns of activity in the auditory cortex and medial temporal lobe. We found that activity in planum temporale and the hippocampus reliably distinguished between the learnt acoustic patterns. Our results demonstrate that these structures are involved in the neural representation of specific acoustic patterns after they have been learnt.
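
    As an illustration of the decoding step described above, a multi-voxel pattern analysis typically trains a linear classifier on trial-wise voxel patterns from a region of interest and evaluates it with leave-one-run-out cross-validation. The sketch below is a generic example of that approach, not the authors' pipeline; the data, labels, and run structure are placeholders.

```python
# Minimal MVPA decoding sketch, assuming a (trials x voxels) matrix of
# response estimates from an auditory region of interest. All data here
# are random placeholders; the classifier choice is illustrative.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels, n_runs = 60, 200, 6
X = rng.normal(size=(n_trials, n_voxels))                 # voxel patterns
y = np.tile([0, 1, 2], n_trials // 3)                     # 3 learnt patterns
runs = np.repeat(np.arange(n_runs), n_trials // n_runs)   # scanner runs

# Leave-one-run-out cross-validation avoids run-specific leakage.
scores = cross_val_score(LinearSVC(), X, y, cv=LeaveOneGroupOut(), groups=runs)
print(f"decoding accuracy: {scores.mean():.2f} (chance = 0.33)")
```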

    Inhibition-excitation balance in the parietal cortex modulates volitional control for auditory and visual multistability

    Perceptual organisation must select one interpretation from several alternatives to guide behaviour. Computational models suggest that this could be achieved through an interplay between inhibition and excitation across competing neural populations coding for each interpretation. Here, to test such models, we used magnetic resonance spectroscopy to measure non-invasively the concentrations of inhibitory γ-aminobutyric acid (GABA) and excitatory glutamate-glutamine (Glx) in several brain regions. Human participants first performed auditory and visual multistability tasks that produced spontaneous switching between percepts. We observed that longer percept durations during behaviour were associated with higher GABA/Glx ratios in the sensory area coding for each modality. When participants were asked to voluntarily modulate their perception, a common factor across modalities emerged: the GABA/Glx ratio in the posterior parietal cortex tended to be positively correlated with the amount of effective volitional control. Our results provide direct evidence that the balance between neural inhibition and excitation within sensory regions resolves perceptual competition. This powerful computational principle appears to be leveraged by both audition and vision, implemented independently across modalities but modulated by an integrated control process.

    Perceptual multistability describes an intriguing situation whereby an observer reports random changes in conscious perception for a physically unchanging stimulus [1, 2]. Multistability is a powerful tool with which to probe perceptual organisation, as it highlights perhaps the most fundamental issue faced by perception for any reasonably complex natural scene: because the information encoded by sensory receptors is never sufficient to fully specify the state of the outside world [3], at each instant perception must choose between a number of competing alternatives. In realistic situations, this process produces a stable and useful representation of the world; in situations with intrinsically ambiguous information, the same process is revealed as multistable perception. A number of theoretical models have converged to pinpoint the generic computational principles likely to be required to explain multistability, and hence perceptual organisation [4-9]. All of these models consider three core ingredients: inhibition between competing neural populations, adaptation within these populations, and neuronal noise. The precise role and relative importance of each ingredient are still being debated: noise is introduced to induce fluctuations in each population and initiate stochastic perceptual switching in some models [7-9], whereas switching dynamics are solely determined by inhibition in others [5, 6]. Functional brain imaging in humans has provided results qualitatively compatible with these computational principles at several levels of the visual processing hierarchy [10]. But for most functional imaging techniques in humans, such as fMRI or MEG/EEG, change…
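
    The three core ingredients listed above (mutual inhibition, adaptation, noise) can be made concrete with a toy two-population rate model, in the spirit of the competition models the text cites. This is a minimal sketch under assumed dynamics and hand-picked parameters, not a model from the paper.

```python
# Two competing populations with cross-inhibition (beta), slow adaptation
# (g, tau_a) and additive noise (sigma). Parameters are illustrative,
# chosen only so that the simulation alternates between percepts.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-10.0 * (x - 0.2)))

def simulate(T=60.0, dt=1e-3, I=1.2, beta=3.0, g=2.0,
             tau_r=0.01, tau_a=1.0, sigma=0.3, seed=1):
    rng = np.random.default_rng(seed)
    r1, r2, a1, a2 = 0.5, 0.4, 0.0, 0.0
    dominant = np.empty(int(T / dt), dtype=np.int8)
    for i in range(dominant.size):
        n1, n2 = sigma * np.sqrt(dt) * rng.normal(size=2)
        r1 += dt / tau_r * (-r1 + sigmoid(I - beta * r2 - g * a1)) + n1
        r2 += dt / tau_r * (-r2 + sigmoid(I - beta * r1 - g * a2)) + n2
        a1 += dt / tau_a * (r1 - a1)   # adaptation tracks the rate slowly
        a2 += dt / tau_a * (r2 - a2)
        dominant[i] = r1 > r2
    return dominant

dom = simulate()
durations = np.diff(np.flatnonzero(np.diff(dom))) * 1e-3   # seconds
if durations.size:
    print(f"{durations.size} percept durations, mean {durations.mean():.2f} s")
```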

    The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement

    Supervised speech enhancement models are trained using artificially generated mixtures of clean speech and noise signals, which may not match real-world recording conditions at test time. This mismatch can lead to poor performance if the test domain differs significantly from the synthetic training domain. In this paper, we introduce the unsupervised domain adaptation for conversational speech enhancement (UDASE) task of the 7th CHiME challenge. This task aims to leverage real-world noisy speech recordings from the target test domain for unsupervised domain adaptation of speech enhancement models. The target test domain corresponds to the multi-speaker reverberant conversational speech recordings of the CHiME-5 dataset, for which the ground-truth clean speech reference is not available. Given a CHiME-5 recording, the task is to estimate the clean, potentially multi-speaker, reverberant speech, removing the additive background noise. We discuss the motivation for the CHiME-7 UDASE task and describe the data, the task, and the baseline system.
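
    For context, a synthetic training mixture of the kind described above is typically built by scaling a noise recording to a target signal-to-noise ratio and adding it to clean speech. The sketch below illustrates that common practice; the file names are hypothetical, and this is not the exact CHiME data pipeline.

```python
# Mix clean speech and noise at a target SNR (common supervised setup).
# Assumes mono files, with the noise at least as long as the speech.
import numpy as np
import soundfile as sf

speech, sr = sf.read("clean_speech.wav")   # hypothetical file names
noise, _ = sf.read("noise.wav")
noise = noise[: len(speech)]

target_snr_db = 5.0
gain = np.sqrt(np.mean(speech ** 2)
               / (np.mean(noise ** 2) * 10 ** (target_snr_db / 10)))
mixture = speech + gain * noise   # model input; `speech` is the training target
sf.write("mixture.wav", mixture, sr)
```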

    Effect of stimulus type and pitch salience on pitch-sequence processing

    Using a same-different discrimination task, it has been shown that discrimination performance for sequences of complex tones varying just detectably in pitch is less dependent on sequence length (1, 2, or 4 elements) when the tones contain resolved harmonics than when they do not [Cousineau, Demany, and Pressnitzer (2009). J. Acoust. Soc. Am. 126, 3179-3187]. This effect had been attributed to the activation of automatic frequency-shift detectors (FSDs) by the shifts in resolved harmonics. The present study provides evidence against this hypothesis by showing that the sequence-processing advantage found for complex tones with resolved harmonics is not found for pure tones or for other sounds thought to activate FSDs (narrow bands of noise, and wide-band noises eliciting pitch sensations due to interaural phase shifts). The present results also indicate that for pitch sequences, processing performance is largely unrelated to pitch salience per se: for a fixed level of discriminability between sequence elements, sequences of elements with salient pitches are not necessarily better processed than sequences of elements with less salient pitches. An ideal-observer model for the same-different binary-sequence discrimination task is also developed in the present study. The model allows the computation of d' for this task using numerical methods.
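
    The paper's ideal-observer model itself is not reproduced here, but the general numerical approach to computing d' for a same-different task can be sketched: simulate noisy internal representations of the two sequences, have the observer compute the log-likelihood ratio of 'different' versus 'same', and convert the resulting ROC area to d'. All distributional assumptions below (binary-valued elements, unit-variance Gaussian internal noise, independently resampled 'different' sequences) are illustrative.

```python
# Monte Carlo d' estimate for an ideal observer in a same-different task
# on binary sequences. Illustrative assumptions, not the paper's model.
import numpy as np
from scipy.stats import norm

def llr(x, y, m):
    # log P(x, y | different) - log P(x, y | same), summed over elements
    px = 0.5 * (norm.pdf(x, m) + norm.pdf(x, -m))
    py = 0.5 * (norm.pdf(y, m) + norm.pdf(y, -m))
    p_same = 0.5 * (norm.pdf(x, m) * norm.pdf(y, m)
                    + norm.pdf(x, -m) * norm.pdf(y, -m))
    return np.sum(np.log(px * py / p_same), axis=-1)

rng = np.random.default_rng(2)
n_trials, seq_len, m = 4000, 4, 0.5
s = rng.choice([-m, m], size=(n_trials, seq_len))   # first sequence
t = rng.choice([-m, m], size=(n_trials, seq_len))   # independent resample
obs = lambda z: z + rng.normal(size=z.shape)        # internal noise

llr_same = llr(obs(s), obs(s), m)
llr_diff = llr(obs(s), obs(t), m)

auc = (llr_diff[:, None] > llr_same[None, :]).mean()   # ROC area
dprime = np.sqrt(2) * norm.ppf(auc)  # equal-variance Gaussian approximation
print(f"AUC = {auc:.3f}, d' ~= {dprime:.2f} for length-{seq_len} sequences")
```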

    Insights on the Neuromagnetic Representation of Temporal Asymmetry in Human Auditory Cortex

    Communication sounds are typically asymmetric in time, and human listeners are highly sensitive to this short-term temporal asymmetry. Nevertheless, causal neurophysiological correlates of auditory perceptual asymmetry remain largely elusive to current analyses and models. Auditory modelling and animal electrophysiological recordings suggest that perceptual asymmetry results from the presence of multiple time scales of temporal integration, central to the auditory periphery. To test this hypothesis, we recorded auditory evoked fields (AEFs) elicited by asymmetric sounds in humans. We found a strong correlation between the perceived tonal salience of ramped and damped sinusoids and the AEFs, as quantified by the amplitude of the N100m dynamics. The N100m amplitude increased with stimulus half-life time, showing a maximal difference between ramped and damped stimuli at a modulation half-life time of 4 ms; this difference was greatly reduced at 0.5 ms and 32 ms. This behaviour of the N100m closely parallels the psychophysical data, in that: i) longer half-life times are associated with a stronger tonal percept, and ii) perceptual differences between damped and ramped stimuli are maximal at a 4 ms half-life time. Interestingly, differences in evoked fields were significantly stronger in the right hemisphere, indicating some degree of hemispheric specialisation. Furthermore, the N100m magnitude was successfully explained by a pitch perception model using multiple scales of temporal integration of auditory nerve activity patterns. This striking correlation between AEFs, perception, and model predictions suggests that the physiological mechanisms involved in the processing of pitch evoked by temporally asymmetric sounds are reflected in the N100m.
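
    The ramped and damped sinusoids referred to above follow a standard construction: a sinusoidal carrier under a repeating exponentially decaying envelope with a given half-life (damped), and its time reversal (ramped), which shares the long-term magnitude spectrum but has the opposite temporal asymmetry. A sketch with illustrative parameters (not necessarily the study's carrier frequency or repetition period):

```python
# Generate a damped sinusoid (exponential envelope, given half-life) and
# its ramped (time-reversed) counterpart. Parameter values are illustrative.
import numpy as np

def damped_sinusoid(half_life_ms, carrier_hz=1000.0, period_ms=50.0,
                    n_periods=10, sr=44100):
    n = int(sr * period_ms / 1000)
    t = np.arange(n) / sr
    envelope = 0.5 ** (t / (half_life_ms / 1000))   # halves every half-life
    return np.tile(envelope * np.sin(2 * np.pi * carrier_hz * t), n_periods)

damped = damped_sinusoid(half_life_ms=4.0)   # the maximally asymmetric case
ramped = damped[::-1]                        # time reversal
```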

    Sons rugueux, sons tendus (Rough sounds, tense sounds)

    IRCAM internal reference: Pressnitzer97b.

    L'approche du phénomÚne sonore

    IRCAM internal reference: Pressnitzer97c.

    Mémoire et perception auditive (Memory and auditory perception)

    We tend to think of hearing a sound and remembering a sound as involving largely distinct processes. Yet to be heard, every sound must pass through the brain's auditory processing, which is shaped by the experience we accumulate over time. In this presentation, I will briefly review recent findings that suggest close links between memory and perception: how merely listening to complex sounds almost inevitably leads to the formation of a memory trace, and how what we have just heard, the context, can profoundly alter the perception of elementary auditory attributes such as pitch.

    What is a melody? On the relationship between pitch and brightness of timbre

    Previous studies showed that the perceptual processing of sound sequences is more efficient when the sounds vary in pitch than when they vary in loudness. We show here that sequences of sounds varying in brightness of timbre are processed with the same efficiency as pitch sequences. The sounds used consisted of two simultaneous pure tones one octave apart, and the listeners' task was to make same/different judgments on pairs of sequences varying in length (one, two, or four sounds). In one condition, brightness of timbre was varied within the sequences by changing the relative level of the two pure tones. In other conditions, pitch was varied by changing the fundamental frequency, or loudness was varied by changing the overall level. In all conditions, only two possible sounds could be used in a given sequence, and these two sounds were equally discriminable. When sequence length increased from one to four, discrimination performance decreased substantially for loudness sequences, but to a smaller extent for brightness and pitch sequences. In the latter two conditions, sequence length had a similar effect on performance. These results suggest that the processes dedicated to pitch and brightness analysis, when probed with a sequence-discrimination task, share unexpected similarities.
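
    The two-tone stimuli described above can be sketched directly: pure tones at f0 and 2*f0 played simultaneously, with brightness controlled by the relative level of the two tones, pitch by f0, and loudness by the overall level. Parameter values below are illustrative, not the experiment's.

```python
# Two simultaneous pure tones one octave apart, with independent knobs for
# brightness (relative level), pitch (f0) and loudness (overall level).
# Values are illustrative only.
import numpy as np

def two_tone(f0=500.0, upper_rel_db=0.0, overall_db=0.0, dur=0.3, sr=44100):
    t = np.arange(int(sr * dur)) / sr
    upper_gain = 10 ** (upper_rel_db / 20)            # brightness knob
    x = np.sin(2 * np.pi * f0 * t) + upper_gain * np.sin(2 * np.pi * 2 * f0 * t)
    return 10 ** (overall_db / 20) * x                # loudness knob

bright = two_tone(upper_rel_db=+6.0)   # more energy in the upper octave
dull = two_tone(upper_rel_db=-6.0)
higher = two_tone(f0=550.0)            # pitch change via f0
louder = two_tone(overall_db=+6.0)
```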
    • 

    corecore